Abstract. We propose a general method for converting online algorithms to local
computation algorithms by selecting a random permutation of the input, and
simulating running the online algorithm. We bound the number of steps of the 
algorithm using a query tree, which models the dependencies between queries. 
We improve previous analyses of query trees on graphs of bounded degree, and 
extend the analysis to the cases where the degrees are distributed binomially, and 
to a special case of bipartite graphs. 

Using this method, we give a local computation algorithm for maximal matching 
in graphs of bounded degree, which runs in time and space $O(\log^3 n)$.
We also show how to convert a large family of load balancing algorithms (related 
to balls and bins problems) to local computation algorithms. This gives several 
local load balancing algorithms which achieve the same approximation ratios as 
the online algorithms, but run in $O(\log n)$ time and space.

Finally, we modify existing local computation algorithms for hypergraph 2-coloring 
and k-CNF and use our improved analysis to obtain better time and space bounds,
of $O(\log^4 n)$, removing the dependency on the maximal degree of the graph from
the exponent. 



1 Introduction 

1.1 Background 

The classical computation model has a single processor which has access to a given
input and, using an internal memory, computes the output. This is essentially the von
Neumann architecture, which has been the driving force since the early days of
computation. The class of polynomial time algorithms is widely accepted as the definition
of efficiently computable problems. Over the years many interesting variations of this
basic model have been studied, focusing on different issues.

Online algorithms (see, e.g., [7]) introduce limitations in the time domain. An online
algorithm needs to select actions based only on the history it observed, without access to
future inputs that might influence its performance. Sublinear algorithms (e.g., [11, 15])
limit the space domain, by limiting the ability of an algorithm to observe the entire
input, and still strive to derive global properties of it.

Local computation algorithms (LCAs) [16] are a variant of sublinear algorithms.
The LCA model considers a computation problem which might have multiple admissible
solutions, each consisting of multiple bits. The LCA can answer queries regarding
parts of the output, in a consistent way, and in poly-logarithmic time. For example, the
input for an LCA for a job scheduling problem consists of the description of n jobs
and m machines. The admissible solutions might be the allocations of jobs to machines
such that the makespan is at most twice the optimal makespan. On any query of a job,
the LCA quickly answers with the job's machine. The correctness property of the LCA
guarantees that different query replies will be consistent with some admissible solution.

1.2 Our results 

Following [2], we use an abstract tree structure, the query tree, to bound the number of
queries performed by certain algorithms. We use these bounds to improve the upper
bounds on the time and space requirements of several algorithms introduced in [2]. We also
give a generic method of transforming online algorithms to LCAs, and apply it to obtain
LCAs for maximal matching and several load balancing problems.

1.2.1 Bounds on query trees Suppose that we have an online algorithm where the 
reply to a query depends on the replies to a small number of previous queries. The reply 
to each of those previous queries depends on the replies to a small number of other 
queries and so on. These dependencies can be used to model certain problems using 
query trees - trees which model the dependency of the replies to a given query on the 
replies to other queries. 

Bounding the size of a query tree is central to the analyses of our algorithms. We
show that the size of the query tree is $O(\log n)$ w.h.p., where n is the number of vertices;
d, the degree bound of the dependency graph, appears in the constant.$^4$ This answers in
the affirmative the conjecture of [2]. Previously, Alon et al. [2] showed that the expected
size of the query tree is constant, and $O(\log^{d+1} n)$ w.h.p.$^5$ Our improvement is
significant in removing the dependence on d from the exponent of the logarithm. We also
show that when the degrees of the graph are distributed binomially, we can achieve the
same bound on the size of the query tree. In addition, we show a trivial lower bound of
$\Omega(\log n / \log\log n)$.

4 Note that, however, the hidden constant is exponentially dependent on d. Whether or not this 
bound can be improved to have a polynomial dependency on d is an interesting open question. 

5 Notice that bounding the expected size of the query tree is not enough for our applications,
since in LCAs we need to bound the probability that any query fails.

We use these results on query trees to obtain LCAs for several online problems:
maximal matching in graphs of bounded degree and several load balancing problems.
We also use the results to improve the previous algorithms for hypergraph 2-coloring
and k-CNF.

1.2.2 Hypergraph 2-coloring We modify the algorithm of [2] for an LCA for hypergraph
2-coloring and, coupled with our improved analysis of query tree size, obtain an
LCA which runs in time and space $O(\log^4 n)$, improving the previous result, an LCA
which runs in $O(\log^{d+1} n)$ time and space.

1.2.3 k-CNF Building on the similarity between hypergraph 2-coloring and k-CNF,
we apply our results on hypergraph 2-coloring to give an LCA for k-CNF which runs
in time and space $O(\log^4 n)$.

We use the query tree to transform online algorithms to LCAs. We simulate online 
algorithms as follows: first a random permutation of the items is generated on the fly. 
Then, for each query, we simulate the online algorithm on a stream of input items ar- 
riving according to the order of the random permutation. Fortunately, because of the 
nature of our graphs (the fact that the degree is bounded or distributed binomially), we 
show that in expectation, we will only need to query a constant number of nodes, and
only $O(\log n)$ nodes w.h.p. We now state our results:

1.2.4 Maximal matching We simulate the greedy online algorithm for maximal
matching, to derive an LCA for maximal matching which runs in time and space $O(\log^3 n)$.

1.2.5 Load Balancing We give several LCAs for load balancing problems which run
in $O(\log n)$ time and space. Our techniques include extending the analysis of the query
tree size to the case where the degrees are selected from a binomial distribution with 
expectation d, and further extending it to bipartite graphs which exhibit the characteris- 
tics of many balls and bins problems, specifically ones where each ball chooses d bins 
at random. We show how to convert a large class of the "power of d choices" online 
algorithms (see, e.g., GHUED) to efficient LCAs. 

1.3 Related work 



Nguyen and Onak [13] focus on transforming classical approximation algorithms into
constant-time algorithms that approximate the size of the optimal solution of problems
such as vertex cover and maximum matching. They generate a random number
$r \in [0, 1]$, called the rank, for each node. These ranks are used to bound the query tree size.

Rubinfeld et al. [16] show how to construct polylogarithmic time local computation
algorithms for maximal independent set computations, scheduling radio network
broadcasts, hypergraph coloring and satisfying k-SAT formulas. Their proof technique
uses Beck's analysis in his algorithmic approach to the Lovász Local Lemma [4], and
a reduction from distributed algorithms. Alon et al. [2], building on the technique of
[13], show how to extend several of the algorithms of [16] to perform in polylogarithmic
space as well as time. They further observe that we do not actually need to assign
each query a rank; we only need a random permutation of the queries. Furthermore,
assuming the query tree is bounded by some k, the query to any node depends on at
most k queries to other nodes, and so a k-wise independent random ordering suffices.
They show how to construct a $1/n^2$-almost k-wise independent random ordering$^6$ from
a seed of length $O(k \log^2 n)$.

Recent developments in sublinear time algorithms for sparse graph and combinatorial
optimization problems have led to new constant time algorithms for approximating
the size of a minimum vertex cover, maximal matching, maximum matching, minimum
dominating set, and other problems (cf. [15, 11, 13, 20]) by randomly querying a
constant number of vertices. A major difference between these algorithms and LCAs is
that LCAs require that w.h.p., the output will be correct on any input, while optimization
problems usually require a correct output only on most inputs. More importantly,
LCAs require a consistent output for each query, rather than only approximating a given
global property.

There is a vast literature on the topic of balls and bins and the power of d choices
(e.g., [3, 6, 9, 18]). For a survey on the power of d choices, we refer the reader to [12].

1.4 Organization of our paper 

The rest of the paper is organized as follows: Some preliminaries and notations that we
use throughout the paper appear in Section 2. In Section 3 we prove the upper bound of
$O(\log n)$ on the size of the query tree in the case of bounded and binomially distributed
degrees. In Section 4 we use this analysis to give improved algorithms for hypergraph
2-coloring and k-CNF. In Section 5 we give an LCA for finding a maximal matching
in graphs of bounded degree. Section 6 expands our query tree result to a special case
of bipartite graphs; we use this bound for bipartite graphs to convert online algorithms
for balls and bins into LCAs for the same problems. The appendices provide in-depth
discussions of the hypergraph 2-coloring and analogous k-CNF LCAs, and a lower
bound on the query tree size.

2 Preliminaries 

Let G = (V, E) be an undirected graph. We denote by $N_G(v) = \{u \in V(G) : (u, v) \in E(G)\}$
the neighbors of vertex v, and by $\deg_G(v)$ we denote the degree of v. When it
is clear from the context, we omit the G in the subscript. Unless stated otherwise, all 
logarithms in this paper are to the base 2. We use [n] to denote the set {1, ... , n}, where 
n > 1 is a natural number. 

We present our model of local computation algorithms (LCAs): Let F be a computational
problem and x be an input to F. Let $F(x) = \{y \mid y \text{ is a valid solution for input } x\}$.
The search problem for F is to find any $y \in F(x)$.



6 A random ordering $D_r$ is said to be $\epsilon$-almost k-wise independent if the statistical distance
between $D_r$ and some k-wise independent random ordering is at most $\epsilon$.






A $(t(n), s(n), \delta(n))$-local computation algorithm A is a (randomized) algorithm
which solves a search problem for F for an input x of size n. However, the LCA A
does not output a solution $y \in F(x)$, but rather implements query access to $y \in F(x)$.
A receives a sequence of queries $i_1, \ldots, i_q$ and for any $q > 0$ satisfies the following: (1)
after each query $i_j$ it produces an output $y_{i_j}$, (2) with probability at least $1 - \delta(n)$, A is
consistent, that is, the outputs $y_{i_1}, \ldots, y_{i_q}$ are substrings of some $y \in F(x)$. (3) A has
access to a random tape and local computation memory on which it can perform current
computations as well as store and retrieve information from previous computations.

We assume that the input x, the local computation tape and any random bits used 
are all presented in the RAM word model, i.e., A is given the ability to access a word 
of any of these in one step. The running time of A on any query is at most t(n), which
is sublinear in n, and the size of the local computation memory of A is at most s(n).
Unless stated otherwise, we always assume that the error parameter $\delta(n)$ is at most
some constant, say, 1/3. We say that A is a strongly local computation algorithm if
both t(n) and s(n) are upper bounded by $O(\log^c n)$ for some constant c.

Two important properties of LCAs are as follows. We say an LCA A is query order 
oblivious (query oblivious for short) if the outputs of A do not depend on the order of 
the queries but depend only on the input and the random bits generated on the random 
tape of A. We say an LCA A is parallelizable if A supports parallel queries, that is A 
is able to answer multiple queries simultaneously so that all the answers are consistent. 

3 Bounding the size of a random query tree 
3.1 The problem and our main results 

In online algorithms, queries arrive in some unknown order, and the reply to each query 
depends only on previous queries (but not on any future events). The simplest way to 
transform online algorithms to LCAs is to process the queries in the order in which they 
arrive. This, however, means that we have to store the replies to all previous queries, 
so that even if the time to compute each query is polylogarithmic, the overall space is 
linear in the number of queries. Furthermore, this means that the resulting LCA is not 
query-oblivious. The following solution can be applied to this problem ([13] and [2]):
Each query v is assigned a random number, $r(v) \in [0,1]$, called its rank, and the queries
are performed in ascending order of rank. Then, for each query x, a query tree can be 
constructed, to represent the queries on which x depends. If we can show that the query 
tree is small, we can conclude that each query does not depend on many other queries, 
and therefore a small number of queries need to be processed in order to reply to query 
x. We formalize this as follows: 

Let G = (V, E) be an undirected graph. The vertices of the graph represent queries,
and the edges represent the dependencies between the queries. A real number $r(v) \in [0, 1]$
is assigned independently and uniformly at random to every vertex $v \in V$; we call
r(v) the rank of v. This models the random permutation of the vertices. Each vertex
$v \in V$ holds an input $x(v) \in R$, where the range R is some finite set. The input is the
content of the query associated with v. A randomized function F is defined inductively
on the vertices of G such that F(v) is a (deterministic) function of x(v) as well as the
values of F at the neighbors w of v for which r(w) < r(v). F models the output of
the online algorithm. We would like to upper bound the number of queries to vertices
in the graph needed in order to compute $F(v_0)$ for any vertex $v_0 \in G$, namely, the time
to simulate the output of query $v_0$ using the online algorithm.
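
The inductive definition of F suggests a direct recursive evaluation: to answer a query on one vertex, evaluate F only at its neighbors of smaller rank, and recurse. The following is a minimal sketch of that idea, not the algorithm analyzed in this paper. It assumes the graph is given as an adjacency dictionary `adj` and the ranks as a dictionary `rank`; the names `local_query`, `online_run` and `mis_rule` are hypothetical, and the maximal-independent-set rule is just one illustrative choice of F.

```python
def local_query(v0, adj, x, rank, rule):
    """Evaluate F(v0) by recursing only into neighbors of smaller rank.

    Returns (F(v0), number of distinct vertices explored for this query).
    The memo is created per query, so nothing is stored between queries.
    """
    memo = {}

    def evaluate(v):
        if v not in memo:
            earlier = sorted((w for w in adj[v] if rank[w] < rank[v]),
                             key=lambda w: rank[w])
            memo[v] = rule(x[v], [evaluate(w) for w in earlier])
        return memo[v]

    return evaluate(v0), len(memo)


def online_run(adj, x, rank, rule):
    """Reference: process every vertex in ascending rank order (the online run)."""
    out = {}
    for v in sorted(adj, key=lambda u: rank[u]):
        earlier = sorted((w for w in adj[v] if rank[w] < rank[v]),
                         key=lambda w: rank[w])
        out[v] = rule(x[v], [out[w] for w in earlier])
    return out


def mis_rule(_input, earlier_values):
    """Example F: a vertex joins the independent set iff no earlier neighbor did."""
    return not any(earlier_values)
```

Because each reply depends only on the input and the ranks, `local_query` returns the same answer regardless of the order in which queries are issued, which is the query-oblivious behavior described above.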

To upper bound the number of queries to the graph, we turn to a simpler task of
bounding the size of a certain d-regular tree, which is an upper bound on the number of
queries. Consider an infinite d-regular tree $\mathcal{T}$ rooted at $v_0$. Each node w in $\mathcal{T}$ is assigned
independently and uniformly at random a real number $r(w) \in [0, 1]$. For every node w
other than $v_0$ in $\mathcal{T}$, let parent(w) denote the parent node of w. We grow a (possibly
infinite) subtree T of $\mathcal{T}$ rooted at $v_0$ as follows: a node w is in the subtree T if and only
if parent(w) is in T and r(w) < r(parent(w)) (for simplicity we assume all the ranks
are distinct real numbers). That is, we start from the root $v_0$ and add all the children of $v_0$
whose ranks are smaller than that of $v_0$ to T. We keep growing T in this manner, where
a node $w' \in T$ is a leaf node in T if the ranks of its d children are all larger than r(w').
We call the random tree T constructed in this way a query tree and we denote by |T|
the random variable that corresponds to the size of T. Note that |T| is an upper bound
on the number of queries since each node in $\mathcal{T}$ has at least as many neighbors as that in
G and if a node is connected to some previously queried nodes, this can only decrease
the number of queries. Therefore the number of queries is bounded by the size of T.
Our goal is to find an upper bound on |T| which holds with high probability.
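
To get a feel for the random variable |T|, the growth process can be simulated directly. The sketch below is an illustration only: it assumes every node of the infinite tree has exactly d children and that the root has the worst-case rank 1, and the function names are hypothetical. Under these assumptions a short calculation (not taken from this paper) gives expected size $e^{dr}$ for a root of rank r, so roughly 7.4 for d = 2 and r = 1.

```python
import random


def query_tree_size(d, rng, root_rank=1.0, cap=1_000_000):
    """Size of one random query tree grown from a root of rank root_rank.

    Children are generated lazily: a child joins the tree only if its
    uniformly random rank is smaller than its parent's rank.
    """
    size = 0
    stack = [root_rank]              # ranks of nodes already admitted to T
    while stack and size < cap:      # cap is a safety stop, rarely reached
        r = stack.pop()
        size += 1
        for _ in range(d):           # each node of the infinite tree has d children
            child_rank = rng.random()
            if child_rank < r:
                stack.append(child_rank)
    return size


def empirical_sizes(d, trials, seed=0):
    """Sizes of `trials` independent query trees, for plotting or summary statistics."""
    rng = random.Random(seed)
    return [query_tree_size(d, rng) for _ in range(trials)]
```

Inspecting `max(empirical_sizes(d, trials))` for growing `trials` gives an informal view of the tail of |T|, which is the quantity the high-probability bound controls.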

We improve the upper bound on the query tree of $O(\log^{d+1} N)$ given in [2] for
the case when the degrees are bounded by a constant d and extend our new bound to 
the case that the degrees of G are binomially distributed, independently and identically 
with expectation d, i.e., deg(v) ~ B(n, d/n). 

Our main result in this section is bounding, with high probability, the size of the 
query tree T as follows. 

Lemma 1. Let G be a graph whose vertex degrees are bounded by d or distributed 
independently and identically from the binomial distribution: deg(v) ~ B(n,d/n). 
Then there exists a constant C(d) which depends only on d, such that

$$\Pr[|T| > C(d) \log n] < 1/n^2,$$

where the probability is taken over all the possible permutations $\pi \in \Pi$ of the vertices
of G, and T is a random query tree in G under $\pi$.

3.2 Overview of the proof 

Our proof of Lemma 1 consists of two parts. Following [2], we partition the query tree
into levels. The first part of the proof is an upper bound on the size of a single (sub)tree
on any level. For the bounded degree case, this was already proved in [2] (the result is
restated as Proposition 1).

We extend the proof of [2] to the binomially distributed degrees case. In both cases
the bound is that with high probability each subtree is of size at most logarithmic in the 
size of the input. 

The second part, which is a new ingredient of our proof, inductively upper bounds
the number of vertices on each level, as the levels increase. For this to hold, it crucially
depends on the fact that all subtrees are generated independently and that the probability
of any subtree being large is exponentially small. The main idea is to show that although 
each subtree, in isolation, can reach a logarithmic size, their combination is not likely 
to be much larger. We use the distribution of the sizes of the subtrees, in order to bound 
the aggregate of multiple subtrees. 

3.3 Bounding the subtree size 

As in [2], we partition the query tree into levels and then upper bound the probability
that a subtree is larger than a given threshold. Let $L > 1$ be a function of d to be
determined later. First, we partition the interval [0,1] into L sub-intervals:
$I_i = (1 - \frac{i}{L+1}, 1 - \frac{i-1}{L+1}]$, for $i = 1, 2, \ldots, L$, and $I_{L+1} = [0, \frac{1}{L+1}]$. We refer to interval $I_i$ as
level i. A vertex $v \in T$ is said to be on level i if $r(v) \in I_i$. We consider the worst case,
in which $r(v_0) \in I_1$. In this case, the vertices on level 1 form a tree $T_1$ rooted at $v_0$.
Denote the number of (sub)trees on level i by $t_i$. The vertices on level 2 will form a
set of trees $\{T_2^{(1)}, \ldots, T_2^{(t_2)}\}$, where the total number of subtrees is at most the sum
of the children of all the vertices in $T_1$ (we only have inequality because some of the
children of the vertices of $T_1$ may be assigned to levels 3 and above). The vertices on
level $i > 1$ form a set of subtrees $\{T_i^{(1)}, \ldots, T_i^{(t_i)}\}$. Note that all these subtrees $\{T_i^{(j)}\}$
are generated independently by the same stochastic process, as the ranks of all nodes in
$\mathcal{T}$ are i.i.d. random variables. In the following analysis, we will set L = 3d.
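
The level of a vertex is determined by its rank alone. As a small illustration, the sketch below maps a rank to its level, assuming the intervals are $I_i = (1 - i/(L+1), 1 - (i-1)/(L+1)]$ for i = 1, ..., L and $I_{L+1} = [0, 1/(L+1)]$; this reading of the partition, and the name `level_of`, are assumptions of the example.

```python
import math
from fractions import Fraction


def level_of(rank, L):
    """Return the index i (1 <= i <= L + 1) of the interval I_i containing rank.

    Each I_i is open on the left and closed on the right, so a rank that
    sits exactly on a boundary 1 - i/(L+1) belongs to level i + 1.
    """
    if not 0 <= rank <= 1:
        raise ValueError("rank must lie in [0, 1]")
    return min(math.floor((1 - rank) * (L + 1)) + 1, L + 1)


# Usage with exact arithmetic, for d = 2 and L = 3d = 6:
# a rank of 1 is on level 1, and a rank of 0 is on the last level.
top_level = level_of(Fraction(1), 6)
bottom_level = level_of(Fraction(0), 6)
```

With floating-point ranks the boundary cases are subject to rounding; since boundaries have probability zero under a uniform rank, this does not affect a simulation.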

For the bounded degree case, bounding the size of the subtree follows from [2]:

Proposition 1 ([2]).$^7$ Let $L \ge d+1$ be a fixed integer and let $\mathcal{T}$ be the d-regular infinite
query tree. Then for any $1 \le i \le L$ and $1 \le j \le t_i$, $\Pr[|T_i^{(j)}| \ge n] \le \sum_{i=n}^{\infty} 2^{-ci} \le 2^{-\Omega(n)}$
for all $n \ge \beta$, where $\beta$ is some constant. In particular, there is an absolute
constant $c_0$ depending on d only such that for all $n \ge 1$,

$$\Pr[|T_i^{(j)}| \ge n] \le e^{-c_0 n}.$$

3.3.1 The binomially distributed degrees case We are interested in bounding the 
subtree size also in the case that the degrees are not a constant d, but rather selected 
independently and identically from a binomial distribution with mean d. 

Proposition 2. Let $\mathcal{T}$ be a tree with vertex degree distributed i.i.d. binomially with
$\deg(v) \sim B(n, d/n)$. For any $1 \le i \le L$ and any $1 \le j \le t_i$,
$\Pr[|T_i^{(j)}| \ge n] \le \sum_{i=n}^{\infty} 2^{-ci} \le 2^{-\Omega(n)}$, for $n \ge \beta$, for some constant $\beta > 0$.

Proof. The proof of Proposition 2 is similar to the proof of Proposition 1 in [2]; we
employ the theory of Galton-Watson processes. For a good introduction to Galton-Watson
branching processes see, e.g., [10].

Consider a Galton-Watson process defined by the probability function $p := \{p_k; k = 0, 1, 2, \ldots\}$,
with $p_k \ge 0$ and $\sum_k p_k = 1$. Let $f(s) = \sum_{k=0}^{\infty} p_k s^k$ be the
generating function of p. For $i = 0, 1, \ldots$, let $Z_i$ be the number of offspring in the
i-th generation. Clearly $Z_0 = 1$ and $\{Z_i : i = 0, 1, \ldots\}$ form a Markov chain. Let
$m := E[Z_1] = \sum_k k p_k$ be the expected number of children of any individual. Let
$Z = Z_0 + Z_1 + \cdots$ be the sum of all offspring in all generations of the Galton-Watson
process. The following result of Otter is useful in bounding the probability that Z is
large.

7 In [2], this lemma is proved for the case of L = d + 1. This immediately establishes Proposition
1 since the worst case is L = d + 1.

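Theorem 1 ([14]). Suppose $p_0 > 0$ and that there is a point $a > 0$ within the circle of
convergence of f for which $a f'(a) = f(a)$. Let $\alpha = a/f(a)$. Let $t = \gcd\{r : p_r > 0\}$,
where gcd stands for greatest common divisor. Then

$$\Pr[Z = n] = \begin{cases} t \left(\frac{a}{2\pi\alpha f''(a)}\right)^{1/2} \alpha^{-n} n^{-3/2} + O(\alpha^{-n} n^{-5/2}), & \text{if } n \equiv 1 \pmod{t}; \\ 0, & \text{if } n \not\equiv 1 \pmod{t}. \end{cases}$$

In particular, if the process is non-arithmetic, i.e. $\gcd\{r : p_r > 0\} = 1$, and $\frac{a}{\alpha f''(a)}$ is
finite, then

$$\Pr[Z = n] = O(\alpha^{-n} n^{-3/2}),$$

and consequently $\Pr[Z \ge n] = O(\alpha^{-n})$.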

We prove Proposition 2 for the case of tree $T_1$; the proof actually applies to all
subtrees $T_i^{(j)}$. Recall that $T_1$ is constructed inductively as follows: for $v \in N(v_0)$ in
$\mathcal{T}$, we add v to $T_1$ if $r(v) < r(v_0)$ and $r(v) \in I_1$. Then for each v in $T_1$, we add the
neighbors $w \in N(v)$ in $\mathcal{T}$ to $T_1$ if $r(w) < r(v)$ and $r(w) \in I_1$. We repeat this process
until there is no vertex that can be added to $T_1$.

Once again, we work with the worst case that $r(v_0) = 1$. To upper bound the size
of $T_1$, we consider a related random process which also grows a subtree of $\mathcal{T}$ rooted at
$v_0$, and denote it by $T_1'$. The process that grows $T_1'$ is the same as that of $T_1$ except for
the following difference: if $v \in T_1'$ and w is a child vertex of v in $\mathcal{T}$, then we add w to
$T_1'$ as long as $r(w) \in I_1$. In other words, we give up the requirement that $r(w) < r(v)$.
Clearly, we always have $T_1 \subseteq T_1'$ and hence $|T_1'| \ge |T_1|$.

Note that the random process that generates $T_1'$ is in fact a Galton-Watson process,
as the rank of each vertex in $\mathcal{T}$ is independently and uniformly distributed in [0, 1]. We
take vertex v to be the parent node. Since $|I_1| = 1/L$, for any vertex $u \in V(G)$,
$u \ne v$, the probability that it is a child node of v in $T_1'$ is

$$d/n \cdot 1/L = d/nL,$$

as the random process that connects u to v and the random process that generates the
rank of u are independent (each edge is chosen with probability d/n, and the probability
that r(u) is in $I_1$ is 1/L). It follows that we have a binomial distribution for the number
of child nodes of v in $T_1'$:



$$p = \left\{(1-q)^n, \binom{n}{1} q (1-q)^{n-1}, \binom{n}{2} q^2 (1-q)^{n-2}, \ldots, q^n\right\},$$

where $q := d/nL$ is the probability that a child vertex in $\mathcal{T}$ appears in $T_1'$ when its
parent vertex is in $T_1'$. Note that the expected number of children of a vertex in $T_1'$ is
$nq = d/L < 1$, so from the classical result on the extinction probability of Galton-Watson
processes (see, e.g., [10]), the tree $T_1'$ is finite with probability one.
The generating function of p is

$$f(s) = (1 - q + qs)^n,$$

as the probability function $\{p_k\}$ obeys the binomial distribution $p_k = \Pr[X = k]$ where
$X \sim B(n, q)$. In addition, the convergence radius of f is $\rho = \infty$ since $\{p_k\}$ has only a
finite number of non-zero terms.



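The derivative of f is

$$f'(s) = nq(1 - q + qs)^{n-1}.$$

Solving the equation $a f'(a) = f(a)$ yields $a n q (1 - q + qa)^{n-1} = (1 - q + qa)^n$
and hence $anq = 1 - q + qa$. Consequently, solving for a gives

$$a = \frac{1-q}{q(n-1)} = \frac{1 - d/nL}{d(n-1)/nL} = \frac{nL - d}{d(n-1)} = \frac{3n-1}{n-1}.$$

We can lower bound $\alpha$ as

$$\alpha = \frac{a}{f(a)} = \frac{1}{f'(a)} = \frac{L}{d\left(1 - q + q\frac{3n-1}{n-1}\right)^{n-1}} = \frac{3}{\left(1 + \frac{2}{3(n-1)}\right)^{n-1}} \ge \frac{3}{e^{2/3}} > 1.$$

Finally we calculate $f''(a)$:

$$f''(a) = q^2 n(n-1)(1 - q + qa)^{n-2} = \frac{d^2 n(n-1)}{n^2 L^2}\left(1 - \frac{d}{nL} + \frac{d}{nL} \cdot \frac{3n-1}{n-1}\right)^{n-2} = \frac{n-1}{9n}\left(1 + \frac{2}{3(n-1)}\right)^{n-2} = \Theta(1),$$

therefore $\frac{a}{\alpha f''(a)}$ is a bounded constant.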
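
The quantities entering Otter's theorem can be checked numerically for concrete values. The sketch below is only a sanity check, assuming L = 3d and q = d/(nL) as in the text, the binomial generating function $f(s) = (1 - q + qs)^n$, and the closed form $a = (1-q)/(q(n-1))$ obtained by solving $a f'(a) = f(a)$; the function name `otter_parameters` is hypothetical.

```python
import math


def otter_parameters(n, d):
    """Evaluate a, alpha = a / f(a) and f''(a) for offspring law B(n, q), q = d/(nL)."""
    L = 3 * d
    q = d / (n * L)

    def f(s):                      # generating function of B(n, q)
        return (1 - q + q * s) ** n

    def f1(s):                     # first derivative f'
        return n * q * (1 - q + q * s) ** (n - 1)

    def f2(s):                     # second derivative f''
        return n * (n - 1) * q * q * (1 - q + q * s) ** (n - 2)

    a = (1 - q) / (q * (n - 1))    # solution of a * f'(a) = f(a)
    return {
        "mean_offspring": n * q,   # equals d / L = 1/3, so the process is subcritical
        "a": a,
        "alpha": a / f(a),
        "f2_at_a": f2(a),
        "residual": a * f1(a) - f(a),   # should be numerically zero
    }


params = otter_parameters(n=1000, d=4)
```

For these values `params["alpha"]` exceeds $3e^{-2/3}$, which is larger than 1, so the tail bound $\Pr[Z \ge n] = O(\alpha^{-n})$ decays geometrically.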

Now applying Theorem 1 to the Galton-Watson process which generates $T_1'$ (note
that t = 1 in our case) gives that there exists a constant $n_0$ such that for $n > n_0$,
$\Pr[|T_1'| = n] \le 2^{-cn}$ for some constant $c > 0$. It follows that
$\Pr[|T_1'| \ge n] \le \sum_{i=n}^{\infty} 2^{-ci} \le 2^{-\Omega(n)}$ for all $n > n_0$. Hence for all large enough n, with
probability at least $1 - 1/n^3$, $|T_1| \le |T_1'| = O(\log n)$. □

The following corollary stems directly from Propositions 1 and 2:

Corollary 1. Let $\mathcal{T}$ be any infinite d-regular query tree or tree with vertex degree
distributed i.i.d. binomially with $\deg(v) \sim B(n, d/n)$. For any $1 \le i \le L$ and any
$1 \le j \le t_i$, with probability at least $1 - 1/n^3$, $|T_i^{(j)}| = O(\log n)$.



3.4 Bounding the increase in subtree size as we go up levels 

From Corollary 1 we know that the size of any subtree, in particular $|T_1|$, is bounded
by $O(\log n)$ with probability at least $1 - 1/n^3$ in both the degree d and the binomial
degree cases. Our next step in proving Lemma 1 is to show that, as we increase the
levels, the size of the tree does not increase by more than a constant factor for each
level. That is, there exists an absolute constant $\eta$ depending on d only such that if the
number of vertices on level k is at most $|T_k|$, then the number of vertices on level k + 1,
$|T_{k+1}|$, satisfies $|T_{k+1}| \le \eta \sum_{i=1}^{k} |T_i| + O(\log n) \le 2\eta |T_k| + O(\log n)$. Since there
are L levels in total, this implies that the number of vertices on all L levels is at most
$O((2\eta)^L \log n) = O(\log n)$.

The following Proposition establishes our inductive step. 

Proposition 3. For any infinite query tree $\mathcal{T}$ with constant bounded degree d (or
degrees i.i.d. $\sim B(n, d/n)$), for any $1 \le i < L$, there exist constants $\eta_1 > 0$ and
$\eta_2 > 0$ s.t. if $\sum_{j=1}^{t_i} |T_i^{(j)}| \le \eta_1 \log n$ then $\Pr[\sum_{j=1}^{t_{i+1}} |T_{i+1}^{(j)}| \ge \eta_1 \eta_2 \log n] < 1/n^2$
for all $n > \beta$, for some $\beta > 0$.

Proof. Denote the number of vertices on level k by $Z_k$ and let $Y_k = \sum_{i=1}^{k} Z_i$. Assume
that each vertex i on level $\le k$ is the root of a tree of size $z_i$ on level k + 1. Notice that

$$Z_{k+1} = \sum_{i=1}^{Y_k} z_i.$$

By Proposition 1 (or Proposition 2), there are absolute constants $c_0$ and $\beta$ depending
on d only such that for any subtree $T_k^{(j)}$ on level k and any $n > \beta$, $\Pr[|T_k^{(j)}| = n] \le e^{-c_0 n}$.
Therefore, given $(z_1, \ldots, z_{Y_k})$, the probability of the forest on level k + 1 consisting
exactly of trees of sizes $(z_1, \ldots, z_{Y_k})$ is at most $\prod_{i=1}^{Y_k} e^{-c_0(z_i - \beta)} = e^{-c_0(Z_{k+1} - \beta Y_k)}$.

Notice that, given $Y_k$ (the number of nodes up to level k), there are at most
$\binom{Z_{k+1} + Y_k - 1}{Y_k - 1} < \binom{Z_{k+1} + Y_k}{Y_k}$ vectors $(z_1, \ldots, z_{Y_k})$ that can realize $Z_{k+1}$.
We want to bound the probability that $Z_{k+1} = \eta Y_k$ for some (large enough)
constant $\eta > 0$. We can bound this as follows:

$$\Pr[|T_{k+1}| = Z_{k+1}] < \binom{Z_{k+1} + Y_k}{Y_k} e^{-c_0(Z_{k+1} - \beta Y_k)} \le \left(\frac{e(Z_{k+1} + Y_k)}{Y_k}\right)^{Y_k} e^{-c_0(Z_{k+1} - \beta Y_k)} = (e(1+\eta))^{Y_k} e^{-c_0(\eta - \beta) Y_k} = e^{Y_k(-c_0(\eta - \beta) + \ln(\eta + 1) + 1)}.$$



It follows that there is some absolute constant c' which depends on d only such
that $\Pr[|T_{k+1}| \ge \eta Y_k] \le e^{-c' \eta Y_k}$. That is, if $\eta Y_k = \Omega(\log n)$, the probability that
$|T_{k+1}| \ge \eta Y_k$ is at most $1/n^3$. Adding the vertices on all L levels and applying the
union bound, we conclude that with probability at least $1 - 1/n^2$, the size of T is at most
$O(\log n)$. □



4 Hypergraph 2-coloring and fc-CNF 

We use the bound on the size of the query tree of graphs of bounded degree to improve
the analysis of [2] for hypergraph 2-coloring. We also modify their algorithm slightly
to further improve the algorithm's complexity. As the algorithm is a more elaborate
version of the algorithm of [2] and the proof is somewhat long, we only state our main
theorem for hypergraph 2-coloring; we defer the proof to Appendix A.

Theorem 2. Let H be a k-uniform hypergraph s.t. each hyperedge intersects at most d
other hyperedges. Suppose that $k \ge 16 \log d + 19$.

Then there exists an $(O(\log^4 n), O(\log^4 n), 1/n)$-local computation algorithm which,
given H and any sequence of queries to the colors of vertices $(x_1, x_2, \ldots, x_s)$, with
probability at least $1 - 1/n^2$, returns a consistent coloring for all $x_i$'s which agrees
with a 2-coloring of H. Moreover, the algorithm is query oblivious and parallelizable.

Due to the similarity between hypergraph 2-coloring and k-CNF, we also have the
following theorem; the proof is in Appendix B.

Theorem 3. Let H be a k-CNF formula with $k \ge 2$. Suppose that each clause intersects
no more than d other clauses, and furthermore suppose that $k \ge 16 \log d + 19$.
Then there exists an $(O(\log^4 n), O(\log^4 n), 1/n)$-local computation algorithm which,
given a formula H and any sequence of queries to the truth assignments of variables
$(x_1, x_2, \ldots, x_s)$, with probability at least $1 - 1/n^2$, returns a consistent truth
assignment for all $x_i$'s which agrees with some satisfying assignment of the k-CNF formula
H. Moreover, the algorithm is query oblivious and parallelizable.

5 Maximal matching

We consider the problem of maximal matching in a bounded-degree graph. We are 
given a graph G = (V, E), where the maximal degree is bounded by some constant d, 
and we need to find a maximal matching. A matching is a set of edges with the property 
that no two edges share a common vertex. The matching is maximal if no other edge 
can be added to it without violating the matching property. 

Assume the online scenario in which the edges arrive in some unknown order. The 
following greedy online algorithm can be used to calculate a maximal matching: When 
an edge e arrives, we check whether e is already in the matching. If it is not, we check 
if any of the neighboring edges are in the matching. If none of them is, we add e to the 
matching. Otherwise, e is not in the matching. 
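
The greedy rule just described can be written in a few lines. The sketch below is an illustration under the assumption that edges arrive as pairs of hashable vertex labels; the function names are hypothetical.

```python
def greedy_maximal_matching(edges):
    """Process edges in arrival order; keep an edge iff both endpoints are still free."""
    matched = set()      # vertices already covered by the matching
    matching = []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


def is_maximal_matching(edges, matching):
    """Check that `matching` is a matching and that no edge of `edges` can be added."""
    covered = [x for e in matching for x in e]
    if len(covered) != len(set(covered)):
        return False                      # two matching edges share a vertex
    covered = set(covered)
    return all(u in covered or v in covered for u, v in edges)


arrivals = [(2, 3), (1, 2), (3, 4), (4, 5)]
result = greedy_maximal_matching(arrivals)   # [(2, 3), (4, 5)]
```

Checking that both endpoints are free is equivalent to checking that no neighboring edge is already in the matching, which is the condition stated in the text.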

We turn to the local computation variation of this problem. We would like to query,
for some edge $e \in E$, whether e is part of some maximal matching. (Recall that all
replies must be consistent with some maximal matching.)

We use the technique of [2] to produce an almost $O(\log n)$-wise independent random
ordering on the edges, using a seed length of $O(\log^3 n)$.$^8$ When an edge e is
queried, we use a BFS (on the edges) to build a DAG rooted at e. We then use the
greedy online algorithm on the edges of the DAG (examining the edges with respect to
the ordering), and see whether e can be added to the matching.
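
A per-edge query can be sketched as follows. This is a simplified illustration, not the procedure analyzed here: it replaces the seed-based almost k-wise independent ordering with an explicit dictionary `rank` over the edges, and it uses memoized recursion instead of an explicit BFS and DAG. All function names are hypothetical.

```python
from collections import defaultdict


def build_incidence(edges):
    """Map each vertex to the list of edges incident to it."""
    incident = defaultdict(list)
    for e in edges:
        u, v = e
        incident[u].append(e)
        incident[v].append(e)
    return incident


def lca_matching_query(e, incident, rank):
    """Decide whether edge e is in the greedy matching induced by `rank`.

    Returns (answer, number of edges explored). An edge is matched iff no
    adjacent edge of smaller rank is matched.
    """
    memo = {}

    def decide(f):
        if f not in memo:
            u, v = f
            earlier = [g for g in incident[u] + incident[v]
                       if g != f and rank[g] < rank[f]]
            earlier.sort(key=lambda g: rank[g])
            memo[f] = not any(decide(g) for g in earlier)
        return memo[f]

    return decide(e), len(memo)


def greedy_in_rank_order(edges, rank):
    """Reference: run the greedy online algorithm on all edges in rank order."""
    matched, matching = set(), set()
    for u, v in sorted(edges, key=lambda f: rank[f]):
        if u not in matched and v not in matched:
            matching.add((u, v))
            matched.update((u, v))
    return matching
```

The second component of the return value counts the edges touched by one query, which is the quantity that the query tree bound controls.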

As the query tree is an upper bound on the size of the DAG, we derive the following
theorem from Lemma 1.

Theorem 4. Let G = (V, E) be an undirected graph with n vertices and maximum
degree d. Then there is an $(O(\log^3 n), O(\log^3 n), 1/n)$-local computation algorithm
which, on input an edge e, decides if e is in a maximal matching. Moreover, the
algorithm gives a consistent maximal matching for every edge in G.

6 The bipartite case and local load balancing 

We consider a general "power of d choices" online algorithm for load balancing. In this 
setting there are n balls that arrive in an online manner, and m bins. Each ball selects a 
random subset of d bins, and queries these bins. (Usually the query is simply the current 
load of the bin.) Given this information, the ball is assigned to one of the d bins (usually 
to the least loaded bin). We denote by LB such a generic algorithm (with a decision rule 
which can depend in an arbitrary way on the d bins that the ball is assigned to). Our 
main goal is to simulate such a generic algorithm. 

The load balancing problem can be represented by a bipartite graph $G = (\{V, U\}, E)$,
where the balls are represented by the vertices V and the bins by the vertices U. The
random selection of a bin $u \in U$ by a ball $v \in V$ is represented by an edge. By
definition, each ball $v \in V$ has degree d. Since there are random choices in the algorithm
LB, we need to specify what we mean by a simulation. For this reason we define the
input to be the following: a graph $G = (\{V, U\}, E)$, where $|V| = n$, $|U| = m$, and
$n = cm$ for some constant $c > 1$. We also allocate a rank $r(u) \in [0, 1]$ to every $u \in U$.
This rank represents the ball's arrival time: if $r(v) < r(u)$ then vertex v arrived before
vertex u. Furthermore, all vertices can have an input value $x(w)$. (This value represents
some information about the node, e.g., the weight of a ball.) Given this input, the
algorithm LB is deterministic, since the arrival sequence is determined by the ranks, and
the random choices of the balls appear as edges in the graph. Therefore by a simulation
we will mean that given the above input, we generate the same allocation as LB.

8 Since the query tree is of size $O(\log n)$ w.h.p., we don't need a complete ordering on the
vertices; an almost $O(\log n)$-wise independent ordering suffices.

We consider the following stochastic process: Every vertex $v \in V$ uniformly and
independently at random chooses d vertices in U. Notice that from the point of view
of the bins, the number of balls which chose them is distributed binomially with
$X \sim B(n, d/m)$. Let $X_v$ and $X_u$ be the random variables for the number of neighbors of
vertices $v \in V$ and $u \in U$ respectively. By definition, $X_v = d$, since all balls have d
neighbors, and hence each $X_u$ is independent of all $X_v$'s. However, there is a
dependence between the $X_u$'s (the number of balls connected to different bins). Fortunately,
this is a classical example where the random variables are negatively dependent (see
footnote 9).

6.1 The bipartite case 

Recall that in Section 3 we assumed that the degrees of the vertices in the graph were
independent. We would like to prove an $O(\log n)$ upper bound on the size of the query
tree T for our bipartite graph. As we cannot use the theorems of Section 3 directly, we show
that the query tree is smaller than another query tree which meets the conditions of our
theorems.

The query tree for the binomial graph is constructed as follows: a root $v_0 \in V$ is
selected for the tree ($v_0$ is the ball whose bin assignment we are interested in
determining). Label the vertices at depth j in the tree by $W_j$. Clearly, $W_0 = \{v_0\}$. At each
depth, we add vertices one at a time to the tree, from left to right, until the depth is
"full" and then we move to the next depth. Note that at odd depths $(2j + 1)$ we add bin
vertices and at even depths $(2j)$ we add ball vertices.

Specifically, at odd depths $(2j + 1)$ we add, for each $v \in W_{2j}$, its d neighbors
$u \in N(v)$ as children, and mark each by u (see footnote 10). At even depths $(2j)$ we
add, for each node marked by $u \in W_{2j-1}$, all its (ball) neighbors $v \in N(u)$ such that
$r(v) < r(parent(u))$, if they have not already been added to the tree. Namely, all the balls
that are assigned to u by time $r(parent(u))$.

A leaf is a node marked by a bin $u_\ell$ for which all neighboring balls
$v \in N(u_\ell) - \{parent(u_\ell)\}$ have a rank larger than its parent, i.e.,
$r(v) > r(parent(u_\ell))$. Namely, $parent(u_\ell)$ is the first ball to be assigned to bin
$u_\ell$. This construction defines a stochastic process $F = \{F_t\}$, where $F_t$ is (a random
variable for) the size of T at time t. (We start at $t = 0$ and t increases by 1 for every
vertex we add to the tree.)
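
To make the construction concrete, the following sketch builds the set of balls in the query tree of a given ball. It is an illustration under assumptions of our own: each ball draws d distinct bins, ranks are attached to the balls (they model arrival times), repeated occurrences of a ball are merged (which can only shrink the tree), and the names `make_instance` and `query_tree_balls` are hypothetical.

```python
import random
from collections import defaultdict, deque


def make_instance(n, m, d, seed=0):
    """n balls, m bins; each ball picks d distinct bins uniformly at random
    and receives a uniform rank in [0, 1) modelling its arrival time."""
    rng = random.Random(seed)
    choices = {v: rng.sample(range(m), d) for v in range(n)}
    rank = {v: rng.random() for v in range(n)}
    return choices, rank


def query_tree_balls(v0, choices, rank):
    """Balls whose decisions may influence the bin assigned to ball v0."""
    # Bin -> balls adjacency; a real LCA would get this from the input oracle.
    balls_of_bin = defaultdict(list)
    for v, bins in choices.items():
        for u in bins:
            balls_of_bin[u].append(v)

    seen = {v0}
    queue = deque([v0])
    while queue:
        v = queue.popleft()
        for u in choices[v]:                # odd depth: the d bins of v
            for w in balls_of_bin[u]:       # even depth: earlier balls at u
                if rank[w] < rank[v] and w not in seen:
                    seen.add(w)
                    queue.append(w)
    return seen
```

The ball with the smallest rank always has a query tree consisting of itself only, since every bin it probes is a leaf.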

We now present our main lemma for bipartite graphs. 

Lemma 2. Let $G = (\{V, U\}, E)$ be a bipartite graph, $|V| = n$ and $|U| = m$ and
$n = cm$ for some constant $c > 1$, such that for each vertex $v \in V$ there are d edges
chosen independently and at random between v and U. Then there is a constant $C(d)$
which depends only on d such that

$\Pr[|T| < C(d) \log n] > 1 - 1/n^2$,

where the probability is taken over all of the possible permutations $\pi \in \Pi$ of the
vertices of G, and T is a random query tree in G under $\pi$.

9 We remind the reader that two random variables $X_1$ and $X_2$ are negatively dependent if
$\Pr[X_1 > x \mid X_2 = a] < \Pr[X_1 > x \mid X_2 = b]$, for $a > b$ and vice versa.

10 A bin can appear several times in the tree. It appears as different nodes, but they are all marked
so that we know it is the same bin. Recall that we assume that all nodes are unique, as this
assumption can only increase the size of the tree.

To prove Lemma 2, we look at another stochastic process $F'$, which constructs
a tree $T'$: we start with a root $v'_0$. Label the vertices at depth j in the tree by $W'_j$.
Assign every vertex y that is added to the tree a rank $r(y) \in [0,1]$ independently and
uniformly at random. Similarly to T, $W'_0 = \{v'_0\}$. At odd depths $(2j + 1)$ we add to
each $v' \in W'_{2j}$, d children (from left to right). At even depths $(2j)$ we add to each node
$u' \in W'_{2j-1}$, $X'_{u'}$ children, where $X'_{u'} \sim B(n, 2d/m)$ and the $X'_{u'}$ of different nodes
are i.i.d. Of the nodes added in this level, we remove all those vertices $y'$ for which
$r(y') > r(parent(parent(y')))$.

Importantly, the neighbor distributions of the vertices in the tree are independent of
each other. If at any point $T'$ has "more than half the bins", i.e., the sum of nodes on
odd levels is at least m/2, we add n + m bin children of rank 0 to some even-level node
in the tree.

Given a tree T we define squash(T) to be the tree T with the odd levels deleted, and 
a node v in level 2j is connected to node v' in level 2j + 2 if v = parent(parent(v')). 

Lemma 3. There is a constant $C(d)$ which depends only on d such that for all large
enough n, $\Pr[|squash(T')| > C(d) \log n] < 1/n^2$.

Because $d \cdot |squash(T')| > |T'| > |squash(T')|$, we immediately get the following
corollary:

Corollary 2. There is a constant $C(d)$ which depends only on d such that for all large
enough n, $\Pr[|T'| > C(d) \log n] < 1/n^2$.

We first make the following claim: 

Claim. $squash(T')$ has vertex degree distributed i.i.d. binomially with
$\deg(v) \sim B(dn, 2d/m)$.

Proof. Each $v \in W'_{2j}$ has d children, each with degree distributed binomially
$\sim B(n, 2d/m)$. For any independent r.v.'s $Y_1, Y_2, \ldots, Y_q$ where $\forall i > 0$,
$Y_i \sim B(n, p)$, we know that

$\sum_{i=1}^{q} Y_i \sim B(qn, p)$.

The Claim follows. □

We can now turn to the proof of Lemma 3.

Proof. As long as $|squash(T')| < m/2$, the proof of the lemma follows the proof of
Lemma 1 with slight modifications to constants and will therefore be omitted.
We notice that $|squash(T')| < m/2$ w.p. at least $1 - 1/n^2$: the proof of Lemma 1
is inductive. We show that at a given level the size of the subtree is at most $O(\log n)$,
and then bound the increase in tree size as we move to the next level. By the final level,
$|squash(T')| = O(\log n)$ w.p. at least $1 - 1/n^2$. Therefore it follows that
$|squash(T')| < m/2$ w.p. at least $1 - 1/n^2$. □

Before we can complete the proof of Lemma 2, we need to define the notion of first
order stochastic dominance:

Definition 1 (First order stochastic dominance). We say a random variable X first
order stochastically dominates (dominates for short) a random variable Y if
$\Pr[X \ge a] \ge \Pr[Y \ge a]$ for all a and $\Pr[X \ge a] > \Pr[Y \ge a]$ for some a. If X
dominates Y, then we write $X > Y$.
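
As a numerical illustration of the definition (not a step of the proof), the sketch below checks dominance between two binomial variables by comparing their upper tails. The helper names `binomial_tail` and `dominates`, the small tolerance `eps`, and the concrete parameters are our own choices; the pair $B(n, 2p)$ versus $B(n, p)$ mirrors the pair $B(n, 2d/m)$ versus $B(n, d/m)$ used in this section.

```python
from math import comb


def binomial_tail(n, p, a):
    """Pr[X >= a] for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(max(a, 0), n + 1))


def dominates(tail_x, tail_y, support, eps=1e-12):
    """First order stochastic dominance of X over Y on a finite support:
    the tail of X is never smaller, and is strictly larger somewhere."""
    never_smaller = all(tail_x(a) >= tail_y(a) - eps for a in support)
    somewhere_larger = any(tail_x(a) > tail_y(a) + eps for a in support)
    return never_smaller and somewhere_larger


n, p = 30, 0.1
big = lambda a: binomial_tail(n, 2 * p, a)    # tail of B(n, 2p)
small = lambda a: binomial_tail(n, p, a)      # tail of B(n, p)
```

With these parameters, `dominates(big, small, range(n + 1))` holds while the reverse comparison does not.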

Lemma 4. For every t, $F'_t$ first-order stochastically dominates $F_t$.

Proof. Assume we add a (bin) vertex $u \in U$ to T at time t. The random variable for the
number of its neighbors is negatively dependent on all other $X_w$, $w \in T_t \cap U$. We label
this variable $X = X_u \mid \{X_w\}$, $w \in T_t$.

We first show that $F'_t > F_t$ when T has less than m/2 bins, and then show that
$F'_t > F_t$ when $T'$ has more than m/2 bins. (It is easy to see why this is enough.)
Assume $|T_t \cap U| < m/2$. $X_u$ is dependent on at most m/2 other random variables $X_w$.
Because the dependency is negative, $X_u$ is maximized when $\forall w$, $X_w = 0$. Therefore,
in the worst case, $X_u$ is dependent on m/2 bins with 0 children. If m/2 bins have 0
children, all edges in G must be distributed between the remaining bins. Therefore
$X < X'_u$, where $X'_u \sim B(n, 2d/m)$.

When $T'$ has more than m/2 bins, by the construction of $F'_t$, it has more than m + n
vertices, and so $F'_t$ trivially dominates $F_t$. □

Combining Corollary 2 and Lemma 4 completes the proof of Lemma 2.



6.2 Local load balancing 

The following theorem states our basic simulation result. 

Theorem 5. Consider a generic online algorithm LB which requires constant time per
query, for n balls and m bins, where $n = cm$ for some constant $c > 0$. There exists
an $(O(\log n), O(\log n), 1/n)$-local computation algorithm which, on query of a (ball)
vertex $v \in V$, allocates v a (bin) vertex $u \in U$, such that the resulting allocation is
identical to that of LB with probability at least $1 - 1/n$.

Proof. Let $K = C(d) \log |U|$ for some constant $C(d)$ depending only on d. K is the
upper bound given in Lemma 2. (In the following we make no attempt to provide the
exact values for $C(d)$ or K.)

We now describe our $(O(\log n), O(\log n), 1/n)$-local computation algorithm for
LB. A query to the algorithm is a (ball) vertex $v_0 \in V$ and the algorithm will choose a
(bin) vertex from the d (bin) vertices connected to $v_0$.
We first build a query tree as follows: Let $v_0$ be the root of the tree. For every
$u \in N(v_0)$, add to the tree the neighbors of u, $v \in V$, such that $r(v) < r(v_0)$. Continue
inductively until either K nodes have been added to the random query tree or no more
nodes can be added to it. If K nodes have been added to the query tree, this is a failure
event, and we assign to $v_0$ a random bin in $N(v_0)$. From Lemma 2, this happens with
probability at most $1/n^2$, and so the probability that some failure event will occur is at
most $1/n$. Otherwise, perform LB on all of the vertices in the tree, in order of addition
to the tree, and output the bin to which ball $v_0$ is assigned by LB. □
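
The simulation in the proof can be sketched as follows. Several choices here are assumptions of ours rather than part of the theorem: the decision rule is the concrete "least loaded bin, ties broken by bin index" rule (one instance of a generic LB), ranks are attached to balls, the balls of the query tree are replayed in increasing rank (their arrival order), and all function names are hypothetical.

```python
import random
from collections import defaultdict


def make_instance(n, m, d, seed=0):
    """Each of n balls picks d distinct bins out of m and gets a random rank."""
    rng = random.Random(seed)
    choices = {v: rng.sample(range(m), d) for v in range(n)}
    rank = {v: rng.random() for v in range(n)}
    return choices, rank


def invert(choices):
    """Bin -> list of balls that chose it."""
    balls_of_bin = defaultdict(list)
    for v, bins in choices.items():
        for u in bins:
            balls_of_bin[u].append(v)
    return balls_of_bin


def least_loaded(bins, load):
    """Example decision rule: least loaded bin, ties broken by index."""
    return min(bins, key=lambda u: (load[u], u))


def run_sequential(choices, rank, rule=least_loaded):
    """The online algorithm: balls arrive in increasing rank."""
    load, assignment = defaultdict(int), {}
    for v in sorted(choices, key=rank.get):
        assignment[v] = rule(choices[v], load)
        load[assignment[v]] += 1
    return assignment


def local_query(v0, choices, balls_of_bin, rank, K, rule=least_loaded, rng=random):
    """Answer a single query by simulating LB on the query tree of v0 only."""
    tree, seen, i = [v0], {v0}, 0
    while i < len(tree):
        v = tree[i]
        i += 1
        for u in choices[v]:
            for w in balls_of_bin[u]:
                if rank[w] < rank[v] and w not in seen:
                    seen.add(w)
                    tree.append(w)
                    if len(tree) >= K:                  # failure event
                        return rng.choice(choices[v0])
    load = defaultdict(int)
    answer = None
    for v in sorted(tree, key=rank.get):                # replay arrivals
        answer = rule(choices[v], load)
        load[answer] += 1
    return answer                                       # v0 arrives last
```

When the cap K is never reached, every local answer coincides with the allocation of `run_sequential`, because every ball that can affect the load seen by $v_0$ lies in its query tree.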

A reduction from various load balancing algorithms gives us the following corollaries
to Theorem 5.

Corollary 3. (Using [?]) Suppose we wish to allocate m balls into n bins of uniform
capacity, $m > n$, where each ball chooses d bins independently and uniformly at
random. There exists a $(\log n, \log n, 1/n)$ LCA which allocates the balls in such a way that
the load of the most loaded bin is $m/n + O(\log\log n / \log d)$ w.h.p.

Corollary 4. (Using [19]) Suppose we wish to allocate n balls into n bins of uniform
capacity, where each ball chooses d bins independently at random, one from each of d
groups of almost equal size $\Theta(n/d)$. There exists a $(\log n, \log n, 1/n)$ LCA which
allocates the balls in such a way that the load of the most loaded bin is
$\ln\ln n / ((d-1) \ln 2) + O(1)$ w.h.p. (see footnote 11).

Corollary 5. (Using [5]) Suppose we wish to allocate m balls into $n < m$ bins,
where each bin i has a capacity $c_i$, and $\sum_i c_i = m$. Each ball chooses d bins at
random with probability proportional to their capacities. There exists a $(\log n, \log n, 1/n)$
LCA which allocates the balls in such a way that the load of the most loaded bin is
$2 \log\log n + O(1)$ w.h.p.

Corollary 6. (Using [5]) Suppose we wish to allocate m balls into $n < m$ bins, where
each bin i has a capacity $c_i$, and $\sum_i c_i = m$. Assume that the size of a large bin is at
least $rn \log n$, for large enough r. Suppose we have s small bins with total capacity $m_s$,
and that $m_s = O((n \log n)^{2/3})$. There exists a $(\log n, \log n, 1/n)$ LCA which allocates
the balls in such a way that the expected maximum load is less than 5.

Corollary 7. (Using [8]) Suppose we have n bins, each represented by one point on a
circle, and n balls are to be allocated to the bins. Assume each ball needs to choose
$d > 2$ points on the circle, and is associated with the bins closest to these points. There
exists a $(\log n, \log n, 1/n)$ LCA which allocates the balls in such a way that the load of
the most loaded bin is $\ln\ln n / \ln d + O(1)$ w.h.p.



11 In fact, in this setting the tighter bound is $\frac{\ln\ln n}{d \ln \phi_d} + O(1)$, where $\phi_d$ is the ratio of
the d-step Fibonacci sequence, i.e. $\phi_d = \lim_{k \to \infty} \sqrt[k]{F_d(k)}$, where for $k \le 0$,
$F_d(k) = 0$, $F_d(1) = 1$, and for $k > 1$, $F_d(k) = \sum_{i=1}^{d} F_d(k - i)$.

6.3 Random ordering

In the above we assume that we are given a random ranking for each ball. If we are not
given such random rankings (in fact, a random permutation of the vertices in U will also
suffice), we can generate a random ordering of the balls. Specifically, since w.h.p. the
size of the random query tree is $O(\log n)$, an $O(\log n)$-wise independent random ordering
(see footnote 12) suffices for our local computation purpose. Using the construction in [2]
of a $1/n^2$-almost $O(\log n)$-wise independent random ordering over the vertices in U which
uses space $O(\log^3 n)$, we obtain $(O(\log^3 n), O(\log^3 n), 1/n)$-local computation
algorithms for balls and bins.



12 See [2] for the formal definitions of k-wise independent random ordering and almost k-wise
independent random ordering.






References 

[1] N. Alon. A parallel algorithmic version of the Local Lemma. Random Structures and
Algorithms, 2:367-378, 1991.

[2] N. Alon, R. Rubinfeld, S. Vardi, and N. Xie. Space-efficient local computation algorithms.
In Proc. 23rd ACM-SIAM Symposium on Discrete Algorithms, pages 1132-1139, 2012.

[3] Y. Azar, A. Z. Broder, A. R. Karlin, and E. Upfal. Balanced allocations. SIAM Journal on
Computing, 29(1):180-200, 1999.

[4] J. Beck. An algorithmic approach to the Lovász Local Lemma. Random Structures and
Algorithms, 2:343-365, 1991.

[5] P. Berenbrink, A. Brinkmann, T. Friedetzky, and L. Nagel. Balls into non-uniform bins.
In Proceedings of the 24th IEEE International Parallel and Distributed Processing
Symposium (IPDPS), pages 1-10. IEEE, 2010.

[6] P. Berenbrink, A. Czumaj, A. Steger, and B. Vöcking. Balanced allocations: The heavily
loaded case. SIAM J. Comput., 35(6):1350-1385, 2006.

[7] A. Borodin and Ran El-Yaniv. Online Computation and Competitive Analysis. Cambridge
University Press, 1998.

[8] John W. Byers, Jeffrey Considine, and Michael Mitzenmacher. Simple load balancing for
distributed hash tables. In Proc. of Intl. Workshop on Peer-to-Peer Systems (IPTPS), pages
80-87, 2003.

[9] D. Dubhashi and D. Ranjan. Balls and bins: A study in negative dependence. Random
Structures and Algorithms, 13:99-124, 1996.

[10] T. Harris. The Theory of Branching Processes. Springer, 1963.

[11] S. Marko and D. Ron. Distance approximation in bounded-degree and general sparse
graphs. In APPROX-RANDOM'06, pages 475-486, 2006.

[12] M. Mitzenmacher, A. Richa, and R. Sitaraman. The power of two random choices: A
survey of techniques and results. In Handbook of Randomized Computing, Vol. I, edited by
P. Pardalos, S. Rajasekaran, J. Reif and J. Rolim, pages 255-312. Norwell, MA: Kluwer
Academic Publishers, 2001.

[13] H. N. Nguyen and K. Onak. Constant-time approximation algorithms via local
improvements. In Proc. 49th Annual IEEE Symposium on Foundations of Computer Science, pages
327-336, 2008.

[14] R. Otter. The multiplicative process. Annals of Mathematical Statistics, 20(2):206-224,
1949.

[15] M. Parnas and D. Ron. Approximating the minimum vertex cover in sublinear time and a
connection to distributed algorithms. Theoretical Computer Science, 381(1-3), 2007.

[16] R. Rubinfeld, G. Tamir, S. Vardi, and N. Xie. Fast local computation algorithms. In Proc.
2nd Symposium on Innovations in Computer Science, pages 223-238, 2011.

[17] R. Rubinfeld, G. Tamir, S. Vardi, and N. Xie. Fast local computation algorithms, 2011.

[18] K. Talwar and U. Wieder. Balanced allocations: the weighted case. In Proc. 39th Annual
ACM Symposium on the Theory of Computing, pages 256-265, 2007.

[19] Berthold Vöcking. How asymmetry helps load balancing. J. ACM, 50:568-589, July 2003.

[20] Y. Yoshida, Y. Yamamoto, and H. Ito. An improved constant-time approximation
algorithm for maximum matchings. In Proc. 41st Annual ACM Symposium on the Theory of
Computing, pages 225-234, 2009.






A Hypergraph two-coloring 

Recall that a hypergraph H is a pair $H = (V, E)$ where V is a finite set whose
elements are called nodes or vertices, and E is a family of non-empty subsets of V, called
hyperedges. A hypergraph is called k-uniform if each of its hyperedges contains
precisely k vertices. A two-coloring of a hypergraph H is a mapping $c : V \to \{red, blue\}$
such that no hyperedge in E is monochromatic. If such a coloring exists, then we say
H is two-colorable. We assume that each hyperedge in H intersects at most d other
hyperedges. Let n be the number of hyperedges in H. Here we think of k and d as fixed
constants and all asymptotic forms are with respect to n. By the Lovász Local Lemma,
when $e(d + 1) < 2^{k-1}$, the hypergraph H is two-colorable (see, e.g., [1]).

Following [17], we let m be the total number of vertices in H. Note that $m \le kn$,
so $m = O(n)$. For any vertex $x \in V$, we use $\mathcal{E}(x)$ to denote the set of hyperedges x
belongs to. For any hypergraph $H = (V, E)$, we define a vertex-hyperedge incidence
matrix $\mathcal{M} \in \{0, 1\}^{m \times n}$ so that, for every vertex x and every hyperedge e,
$\mathcal{M}_{x,e} = 1$ if and only if $e \in \mathcal{E}(x)$. Because we assume both k and d are constants, the
incidence matrix $\mathcal{M}$ is necessarily very sparse. Therefore, we further assume that the
matrix $\mathcal{M}$ is implemented via linked lists for each row (that is, vertex x) and each column
(that is, hyperedge e).

Let G be the dependency graph of the hyperedges in H. That is, the vertices of
the undirected graph G are the n hyperedges of H, and a hyperedge $E_i$ is connected to
another hyperedge $E_j$ in G if $E_i \cap E_j \neq \emptyset$. It is easy to see that if the input
hypergraph is given in the above described representation, then we can find all the
neighbors of any hyperedge $E_i$ in the dependency graph G (there are at most d of them) in
constant time (which depends on k and d).
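
A minimal sketch of this representation, assuming hyperedges are given as a list of vertex collections and identified by their list index; the function name `dependency_graph` and the dictionary-of-sets layout (standing in for the linked lists) are our own.

```python
from collections import defaultdict


def dependency_graph(hyperedges):
    """Map each hyperedge index to the indices of the hyperedges it
    intersects, using per-vertex incidence lists."""
    edges_of_vertex = defaultdict(set)      # the sets E(x) of the text
    for i, e in enumerate(hyperedges):
        for x in e:
            edges_of_vertex[x].add(i)

    neighbors = {}
    for i, e in enumerate(hyperedges):
        # Union of the incidence lists of the k vertices of e, minus e itself.
        neighbors[i] = set().union(*(edges_of_vertex[x] for x in e)) - {i}
    return neighbors
```

For a single hyperedge, the lookup touches k incidence lists, each holding at most d + 1 hyperedges, which is where the constant-time bound comes from.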

A natural question to ask is: Given a two-colorable hypergraph H, and a vertex v,
can we quickly compute the coloring of v? Alon et al. gave ([2]) a polylog(n)-time
and space LCA based on Alon's 3-phase parallel hypergraph coloring algorithm ([1]),
where the exponent of the logarithm depends on d. We get rid of the dependence on
d (in the exponent of the logarithm) using the improved analysis of the query tree in
Section 3 together with a modified 4-phase coloring algorithm.

Our main result in this section is the following: given a two-colorable hypergraph H whose
two-coloring scheme is guaranteed by the Lovász Local Lemma (with slightly weaker
parameters), we give an $(O(\log^4 n), O(\log^4 n), 1/n)$-local computation algorithm. We
restate our main theorem:

Theorem 6. Let H be a k-uniform hypergraph s.t. each hyperedge intersects at most d
other hyperedges. Suppose that $k > 16 \log d + 19$.

Then there exists an $(O(\log^4 n), O(\log^4 n), 1/n)$-local computation algorithm which,
given H and any sequence of queries to the colors of vertices $(x_1, x_2, \ldots, x_s)$, with
probability at least $1 - 1/n^2$, returns a consistent coloring for all $x_i$'s which agrees
with a 2-coloring of H. Moreover, the algorithm is query oblivious and parallelizable.

In fact, we only need:

$k \ge 3 \lceil \log 16d(d-1)^3(d+1) \rceil + \lceil \log 2e(d+1) \rceil$

Throughout the following analysis, we set $k_1 = k$, and

$k_i = k_{i-1} - \lceil \log 16d(d-1)^3(d+1) \rceil$

Notice that the theorem's premise simply implies that $2^{k_4 - 1} > e(d + 1)$, as required by
the Lovász Local Lemma.
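
The arithmetic behind the thresholds can be checked numerically. The sketch below assumes base-2 logarithms and $d \ge 2$ (so that the logarithm is defined); it takes the per-phase decrement to be $\lceil \log 16d(d-1)^3(d+1) \rceil$, and the helper names `phase_step`, `minimal_k` and `phase_thresholds` are hypothetical.

```python
from math import ceil, e, log2


def phase_step(d):
    """Number of vertices per edge given up in each phase (assumes d >= 2)."""
    return ceil(log2(16 * d * (d - 1) ** 3 * (d + 1)))


def minimal_k(d):
    """Right-hand side of the weaker requirement on k."""
    return 3 * phase_step(d) + ceil(log2(2 * e * (d + 1)))


def phase_thresholds(k, d):
    """The thresholds k_1, k_2, k_3, k_4 for edge size k."""
    return [k - i * phase_step(d) for i in range(4)]
```

For example, with d = 2 the decrement is 7, `minimal_k(2)` is 26, and the thresholds are 26, 19, 12, 5; indeed $2^{5-1} = 16 > 3e$.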

A.1 The general phase - random coloring

In each phase we begin with subsets $V_i$ and $E_i$ of V and E, such that each edge contains
at least $k_i$ vertices. We sequentially assign colors at random to the vertices, as long as
every monochromatic edge has at least $k_{i+1}$ uncolored vertices. Once the phase is over
we do not change this assignment.

If an edge has all of its vertices besides $k_{i+1}$ colored in one color, it is labeled
dangerous. All the uncolored vertices in a dangerous edge are labeled saved and we do
not color them in this phase. We proceed until all vertices in $V_i$ are either red, blue, or
saved. Let the survived hyperedges be all the edges that do not contain both red and
blue vertices. Each survived edge contains some vertices colored in one color, and at
least $k_{i+1}$ saved vertices.
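
A single pass of this procedure can be sketched as follows. This is an illustration under our own assumptions: vertices are processed in the order given, hyperedges are tuples of vertices each larger than `k_next` (which plays the role of $k_{i+1}$), a truly random generator replaces the limited-independence randomness, and the name `random_coloring_phase` is hypothetical.

```python
import random


def random_coloring_phase(vertices, edges, k_next, rng):
    """Color vertices one by one; freeze the uncolored vertices of any edge
    that is monochromatic so far and has exactly k_next uncolored vertices."""
    color = {}       # vertex -> 'red' or 'blue'
    saved = set()    # vertices left uncolored in this phase
    for x in vertices:
        if x in saved:
            continue
        color[x] = rng.choice(('red', 'blue'))
        for e in edges:
            if x not in e:
                continue
            used = {color[y] for y in e if y in color}
            uncolored = [y for y in e if y not in color]
            if len(used) == 1 and len(uncolored) == k_next:
                saved.update(uncolored)      # e is dangerous
    # Survived edges: those that do not contain both colors.
    survived = [e for e in edges
                if len({color[y] for y in e if y in color}) < 2]
    return color, saved, survived
```

After the pass, every vertex is red, blue or saved, and every survived edge retains at least `k_next` saved vertices, which is the invariant the next phase relies on.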

Let $S_i$ be the set of survived edges after a random coloring in Phase i, and consider
$G|_{S_i}$, the restriction of G to $S_i$. The probability that $G|_{S_i}$ contains a connected
component of size $d^3 u$ is at most $|V_i| 2^{-u}$ ([1]). In particular, after repeating the random
coloring procedure $t_i$ times, there is no connected component of size greater than $d^3 u$
with probability at least

$1 - (|V_i| 2^{-u})^{t_i}$

If the query vertex x has been assigned a color in the i-th phase, we can simply
return this color. Otherwise, if it is a saved vertex we let $C_i(x)$ be the connected
component containing x in $G|_{S_i}$. Finally, since the coloring of $C_i(x)$ is independent of all
other uncolored vertices, we can restrict ourselves to $E_{i+1} = C_i(x)$ in the next phase.

A.2 Phase 1: partial random coloring 

In the first phase we begin with the whole hypergraph, i.e. $V_1 = V$, $E_1 = E$, and
$k_1 = k$. Thus, we cannot even assign a random coloring to all the vertices in sublinear
complexity. Instead, similarly to the previous sections, we randomly order the vertices
of the hypergraph and use a query tree to randomly assign colors to all the vertices that
arrive before x and may influence it. Note that this means that we can randomly assign
the colors only once.

If x is a saved vertex, we must compute $E_2 = C_1(x)$, the connected component
containing x in $G|_{S_1}$. Notice that the size of $C_1(x)$ is bounded w.h.p.:

$\Pr[|C_1(x)| > 4d^3 \log n] < n 2^{-4 \log n} = n^{-3}$

In order to compute $C_1(x)$, we run BFS on $G|_{S_1}$. Whenever we reach a new node,
we must first randomly assign colors to the vertices in its query tree, like we did for x's
query tree. Since (w.h.p.) there are at most $O(\log n)$ edges in $C_1(x)$, we query for trees
of vertices in at most $O(\log n)$ edges. Therefore in total we color at most $O(\log^2 n)$
vertices.

Finally, since we are only interested in $O(\log^2 n)$ vertices, we may consider a
coloring which is only $O(\log^2 n)$-wise independent, and a random ordering which is
only $n^{-3}$-almost $O(\log^2 n)$-independent. Given the construction in [2], this can be
done in space and time complexity $O(\log^4 n)$.

LCA for Hypergraph Coloring

Preprocessing:

1. Generate $O(\log n)$ independent ensembles, each consisting of $O(\log n)$-wise
   independent random variables in $\{0, 1\}^m$

2. Generate $\pi$, an almost $\log^2 n$-wise independent random ordering over [m]

Phase 1:

Input: a vertex $x \in V$
Output: a color in {red, blue}

1. Use BFS to find the query tree T rooted at x, based on the ordering $\pi$

2. Randomly color the vertices in T according to the order defined by $\pi$

3. If x is colored red or blue, return the color

4. Else:

   (a) Starting from $\mathcal{E}(x)$ (a), run BFS in $G|_{S_1}$ (b) in order to find the connected
       component $E_2 = C_1(x)$ of survived hyperedges around x

   (b) Let $V_2$ be the set of uncolored vertices in $E_2$
       Run Phase 2 Coloring$(x, E_2, V_2)$

(a) Recall that $\mathcal{E}(x)$ is the set of hyperedges containing x.
(b) $S_1$ denotes the set of survived hyperedges in E.

Fig. 1. Local computation algorithm for Hypergraph Coloring



A.3 Phase 2 and 3: gradually decreasing the component size 

Phase 2 and 3 are simply iterations of the general phase with parameters as described
below. With high probability we have that $|E_2| \le 4d^3 \log n$, and each edge has $k_2$
uncolored vertices. After at most $t_2 = \log n$ repetitions of the random coloring
procedure, the probability that no repetition leaves all connected components of survived
edges of size at most $2d^3 \log\log n$ is

$\left( (4d^3 k_2 \log n) \, 2^{-2 \log\log n} \right)^{\log n} < n^{-3}$

Similarly, in the third phase we begin with $|E_3| \le 2d^3 \log\log n$, and after $t_3 = \log n$
repetitions the probability that no repetition leaves all connected components of survived
edges of size at most $\frac{\log\log n}{k_4}$ is

$\left( (2d^3 k_3 \log\log n) \, 2^{-\frac{\log\log n}{k_4 d^3}} \right)^{\log n} < n^{-3}$






Phase i Coloring$(x, E_i, V_i)$, $i \in \{2, 3\}$
Input: a vertex $x \in V_i$ and subsets $E_i \subseteq E$ and $V_i \subseteq V$
Output: a color in {red, blue} or FAIL

1. Repeat the following $\log n$ times and stop if a good coloring is found (a)

   (a) Sequentially try to color every vertex in $V_i$ uniformly at random

   (b) Explore the dependency graph of $G|_{S_i}$

   (c) Check if the coloring is good

2. If x is colored in the good coloring, return that color
   Else

   (a) Compute the connected component $C_i(x) = E_{i+1}$ and then also $V_{i+1}$

   (b) Run Phase i + 1 Coloring$(x, E_{i+1}, V_{i+1})$

(a) Following [17], let $S_i$ be the set of survived hyperedges in $E_i$ after all vertices in $V_i$ are
    either colored or saved. Now we explore the dependency graph of $S_i$ to find all
    the connected components.

    We say a Phase 2 coloring is good if all connected components in $G|_{S_2}$ have sizes at
    most $2d^3 \log\log n$.

    Similarly, we say a Phase 3 coloring is good if all connected components in $G|_{S_3}$ have
    sizes at most $\frac{\log\log n}{k_4}$.

Fig. 2. Local computation algorithm for Hypergraph Coloring: Phase 2 and Phase 3



Phase 4 Coloring$(x, E_4, V_4)$

Input: a vertex $x \in V_4$ and subsets $E_4 \subseteq E$ and $V_4 \subseteq V$
Output: a color in {red, blue}

1. Go over all possible colorings of the connected component $V_4$ and color it using a
   feasible coloring.

2. Return the color c of x in this coloring.

Fig. 3. Local computation algorithm for Hypergraph Coloring: Phase 4

A.4 Phase 4: brute force

Finally, we are left with a connected component of $|E_4| \le \frac{\log\log n}{k_4}$, and each edge has
$k_4$ uncolored vertices. By the Lovász Local Lemma, there must exist a coloring (see
e.g. Theorem 5.2.1 in [?]). We can easily find this coloring via brute force search in
time $O(\log n)$.
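
The brute force step can be sketched as an exhaustive search over all colorings of the remaining vertices. In this sketch, which is ours and not the figure's exact procedure, each hyperedge is assumed to be already restricted to its uncolored vertices, and a coloring is accepted when every restricted hyperedge receives both colors; the function name `brute_force_two_coloring` is hypothetical.

```python
from itertools import product


def brute_force_two_coloring(vertices, edges):
    """Try all 2^|vertices| colorings; return the first one in which no
    hyperedge is monochromatic, or None if there is none."""
    vertices = list(vertices)
    for assignment in product(('red', 'blue'), repeat=len(vertices)):
        coloring = dict(zip(vertices, assignment))
        if all(len({coloring[x] for x in e}) == 2 for e in edges):
            return coloring
    return None
```

With at most $\log\log n$ uncolored vertices in the component there are at most $2^{\log\log n} = \log n$ colorings to try. As a sanity check, the search returns `None` on the Fano plane, a standard example of a 3-uniform hypergraph that is not two-colorable.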

B k-CNF

As another application, our hypergraph coloring algorithm can be easily modified to
compute a satisfying assignment of a k-CNF formula, provided that the latter satisfies
some specific properties.

Let H be a k-CNF formula on m Boolean variables $x_1, \ldots, x_m$. Suppose H has n
clauses $H = A_1 \wedge \cdots \wedge A_n$ and each clause consists of exactly k distinct literals (see
footnote 13). We say two clauses $A_i$ and $A_j$ intersect with each other if they share some
variable (or the negation of that variable). As in the case for hypergraph coloring, k and d
are fixed constants and all asymptotics are with respect to the number of clauses n (and
hence m, since $m \le kn$). Our main result is the following.

Theorem 7. Let H be a k-CNF formula with $k > 2$. Suppose that each clause
intersects no more than d other clauses, and furthermore suppose that $k > 16 \log d + 19$.
Then there exists an $(O(\log^4 n), O(\log^4 n), 1/n)$-local computation algorithm which,
given a formula H and any sequence of queries to the truth assignments of variables
$(x_1, x_2, \ldots, x_s)$, with probability at least $1 - 1/n^2$, returns a consistent truth
assignment for all $x_i$'s which agrees with some satisfying assignment of the k-CNF formula
H. Moreover, the algorithm is query oblivious and parallelizable.

Proof [Sketch]: We follow a 4-phase algorithm similar to that of hypergraph
two-coloring as presented in Appendix A. In every phase, we sequentially assign random
values to a subset of the remaining variables, maintaining a threshold of $k_i$ unassigned
variables in each unsatisfied clause. Since the same (in fact, slightly stronger) bounds
that hold for the connected components in the hyperedges dependency graph also hold
for the clauses dependency graph ([17]), we can return an answer which is consistent
with a satisfying assignment with probability at least $1 - 1/n^2$. □

C Lower bound on the size of the query tree 

We prove a lower bound on the size of the query tree. 

Theorem 8. Let G be a random graph whose vertex degree is bounded by $d > 2$
or distributed independently and identically from the binomial distribution:
$\deg(v) \sim B(n, d/n)$ ($d > 2$). Then

$\Pr[|T| > \log n / \log\log n] > 1/n$,

where the probability is taken over all random permutations $\pi \in \Pi$ of the vertices, and
T is the largest query tree in G (under $\pi$).

13 Our algorithm works for the case that each clause has at least k literals; for simplicity, we
assume that all clauses have uniform size.

Proof. For both the bounded degree and the binomial distribution cases, there exists a
path of length at least $k = \log n / \log\log n$ in the graph w.h.p. Label the vertices on the
path $v_1, v_2, \ldots, v_k$. There are $k!$ possible permutations of the weights of the vertices
on the path. The probability of choosing the permutation in which
$w(v_1) < w(v_2) < \cdots < w(v_k)$ is $1/k!$.

$k! = (\log n / \log\log n)!$
$< (\log n / \log\log n)^{\log n / \log\log n}$
$< n$.

Therefore, $1/k! > 1/n$ and so the probability of the query tree having size
$\log n / \log\log n$ is at least $1/n$. □
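
The chain of inequalities in the proof can be checked numerically for concrete values of n. The sketch below assumes base-2 logarithms and rounds the path length down to an integer; both choices, and the helper name `path_length`, are ours.

```python
from math import factorial, floor, log2


def path_length(n):
    """k = floor(log n / log log n), with logarithms taken base 2."""
    return floor(log2(n) / log2(log2(n)))


def ordering_probability(n):
    """Probability 1/k! that the k path vertices are ranked increasingly."""
    return 1 / factorial(path_length(n))
```

For instance, for $n = 2^{16}$ we get k = 4 and $k! = 24 < n$, so the increasing ordering along the path has probability 1/24, well above 1/n.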



